We present Very Small Array (VSA) observations (centred on $\approx 34$ GHz) on scales
$\approx 20$ arcmin towards a complete, X-ray-flux-limited sample of seven clusters at
redshift $z < 0.1$. Four of the clusters have significant Sunyaev-Zel'dovich (SZ)
detections in the presence of CMB primordial anisotropy. For all seven, we use a Bayesian
Markov-Chain-Monte-Carlo (MCMC) method for inference from the VSA data, with X-ray
priors on cluster positions and temperatures, and radio priors on sources. In this
context, the CMB primordial fluctuations are an additional source of Gaussian noise,
and are included in the model as a non-diagonal covariance matrix derived from the
known angular power spectrum. In addition, we make assumptions of $\beta$-model gas
distributions and of hydrostatic equilibrium, to evaluate probability densities for the
gas mass ($M_{\rm gas}$) and total mass ($M_r$) out to $r_{200}$, the radius at which the
average density enclosed is 200 times the critical density at the redshift of the cluster.
This is further than has been done before and close to the classical value for a collapsed
cluster. Our combined estimate of the gas fraction ($f_{\rm gas} = M_{\rm gas}/M_r$) is $0.08^{+0.06}_{-0.04}\,h^{-1}$.
The random errors are poor (note however that the errors are higher than would have 
been obtained with the usual chi-squared method on the same data) but the control 
of bias is good. We have described the MCMC analysis method specifically in terms 
of SZ but hope the description will be of more general use. We find that the effects 
of primordial CMB contamination tend to be similar in the estimates of both $M_{\rm gas}$
and $M_r$ over the narrow range of angular scales we are dealing with, so that there is
little effect of primordials on the $f_{\rm gas}$ determination. Using our $M_r$ estimates
we find a normalisation of the mass-temperature relation based on the profiles from the
VSA cluster pressure maps that is in good agreement with recent $M$-$T$ determinations
from X-ray cluster measurements.

Key words: cosmology: observations - cosmic microwave background - galaxies: clusters:
individual (Coma, A1795, A399, A401, A478, A2142, A2244) - X-rays: galaxies: clusters
1 INTRODUCTION

Galaxy clusters have long been thought to provide a faithful sample of cosmic baryonic
matter (see e.g. White et al. 1993; Evrard 1997). One quantity often calculated and
assessed in such work is the gas fraction $f_{\rm gas}$, which is defined as the (baryonic)
gas mass over the total (baryonic plus dark matter) mass of the cluster. We here present
Sunyaev-Zel'dovich (SZ) (Sunyaev & Zel'dovich 1972; see also e.g. Birkinshaw 1999;
Carlstrom et al. 2002) observations of a sample of clusters, from which we infer
$f_{\rm gas}$. Our random errors are high but the sample is complete, the redshifts
deliberately low, and we are able to estimate $f_{\rm gas}$ out to radii at which the
overdensity of the enclosed region is close to the classical value of 178 for a collapsed
object (see e.g. Peacock 1999). First we review some of the existing $f_{\rm gas}$
measurements.

A popular route in investigating cosmic baryonic matter is the detailed study of the
X-ray emission from cluster gas. For example, in an investigation based on ROSAT PSPC
data (Ettori & Fabian 1999), a sample of 36 clusters of redshift $0.05 \le z \le 0.44$
was used to measure $f_{\rm gas}$. Assumptions of isothermality and hydrostatic
equilibrium were required. The resulting $f_{\rm gas}$ distribution (within $r_{500}$,
that is, where the mean density inside this radius is 500 times the critical density at
the redshifts of the clusters) was centred on a value
$f_{\rm gas}(r_{500}) = 0.168\,h_{50}^{-1.5}$. Values for individual clusters were found
to vary between 0.101 and 0.245. Mohr et al. (1998) also analysed PSPC data on 45 X-ray
selected clusters, finding a mean $f_{\rm gas}(r_{500})$ of $0.212\,h_{50}^{-1.5}$ in a
subsample of 27 clusters hotter than 5 keV. Allen et al. (2002), following a similar
route (supplemented by gravitational lensing information on the total mass) with Chandra
imaging spectrometer data, find, for a set of six clusters with $0.103 \le z \le 0.461$,
a mean $f_{\rm gas}$ within $r_{2500}$ of $0.113 \pm 0.005\,h_{70}^{-1.5}$ for a
$\Lambda$CDM model, a very precise determination with very similar values for each
cluster. Allen et al. (2003), with additional data, investigated the observed change of
$f_{\rm gas}$ with cosmology.

Studies making use of the SZ effect have potential advantages for gas and gravitational
potential measurements (where the potential is obtained via calculation of the total
mass). The X-ray signal is proportional to $n_e^2$ (where $n_e$ is electron density),
while the SZ signal is proportional to $n_e$. This means that SZ is less biased towards
concentration and can constrain clumping. Although X-ray telescopes achieve excellent
signal to noise, they are restricted to observing the denser, inner regions of a cluster
(e.g. out to $r_{2500}$). With SZ it is possible to measure $n_e(r)$ over a larger range
of $r$ (e.g. close to the virial radius) as less dynamic range is required.
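
The dynamic-range point can be illustrated with a short numerical sketch. It assumes an
isothermal $\beta$-model (the gas profile adopted later in this paper), for which the
projected X-ray surface brightness falls as $(1 + \theta^2/\theta_c^2)^{1/2 - 3\beta}$
and the SZ decrement as $(1 + \theta^2/\theta_c^2)^{1/2 - 3\beta/2}$. The function names
and the choice $\beta = 2/3$ are illustrative assumptions, not values from this work.

```python
def xray_profile(x, beta):
    """Projected X-ray surface brightness of an isothermal beta-model,
    normalised to 1 at the centre; x = theta / theta_c."""
    return (1.0 + x * x) ** (0.5 - 3.0 * beta)


def sz_profile(x, beta):
    """Projected SZ decrement of an isothermal beta-model,
    normalised to 1 at the centre; x = theta / theta_c."""
    return (1.0 + x * x) ** (0.5 - 1.5 * beta)


beta = 2.0 / 3.0  # illustrative value only, not a fitted result
for x in (1.0, 5.0, 10.0):
    print(f"theta/theta_c = {x:4.1f}:  "
          f"X-ray {xray_profile(x, beta):.2e},  SZ {sz_profile(x, beta):.2e}")
```

For $\beta = 2/3$ the normalised X-ray profile is the cube of the normalised SZ profile,
so the X-ray signal drops by orders of magnitude over a range of radii in which the SZ
signal falls by only a factor of a few.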

Myers et al. (1997) used the OVRO 5.5 m telescope to observe the SZ effect in 3 clusters
at 32 GHz. With the addition of the Coma cluster (observed by Herbig et al. 1995), they
obtain a gas fraction of $f_{\rm gas} = 0.061 \pm 0.011\,h^{-1}$. This sample of objects
lies in the redshift range $0.023 \le z \le 0.0899$, and includes three clusters which we
also present here. Mason et al. (2001) extend the sample to seven clusters, incorporating
a further two discussed in this paper. The data were used to calculate $H_0$.

Grego et al. (2001) used the OVRO and BIMA arrays to make SZ observations of galaxy
clusters at 30 GHz. The data were used to infer the gas mass and total mass, thus
constraining $f_{\rm gas}$ (within $r_{500}$) in 18 X-ray selected clusters in the
redshift range $0.171 \le z \le 0.826$. The mean value obtained for the full sample was
$f_{\rm gas} = 0.081^{+0.009}_{-0.011}\,h_{100}^{-1}$. In addition, a 'fair' subsample is
defined as the five most X-ray luminous clusters in the EMSS sample. These objects have
redshift $0.328 \le z \le 0.826$, and together give a mean gas fraction
$f_{\rm gas} = 0.089^{+0.018}_{-0.019}\,h_{100}^{-1}$.

One of the aims of the VSA project (Watson et al. 2003; Taylor et al. 2003; Scott et al.
2003; Rubiño-Martín et al. 2003; Grainge et al. 2003; Slosar et al. 2003; Dickinson et
al. 2004; Rebolo et al. 2004) has been to image nearby, massive clusters in SZ. The VSA
baselines at $\approx 34$ GHz couple well to the angular scales of such clusters. Here we
describe SZ observations and cluster-parameter inferences of an X-ray selected, complete
sample of seven clusters, with redshift $0.023 \le z \le 0.098$ and median 0.075. The age
of the Universe at $z = 0.075$ is 1.7 times its age at $z = 0.55$. The importance of
low-$z$ work is illustrated by the following two points:

• The low redshifts of the clusters mean that they have 
particularly good X-ray data, and one can be reasonably 
confident that bright X-ray selected complete samples are 
in fact complete. 

• Since clusters grow under gravity, on average low redshift clusters should be more
evolved than those at higher redshift. Comparison of, for example, $f_{\rm gas}$ in low-
and high-$z$ samples is important. (Of course, we do not know how big the samples have
to be to encompass meaningful averages.)

One immediate difficulty on these angular scales is con- 
tamination by CMB primordial anisotropy. At the start of 
this VSA observational programme, it was evident that we 
needed an analysis method that would apply the inference 
process correctly and would properly cope with error distri- 
butions in low signal-to-noise situations. There is the ad- 
ditional difficulty of dealing with (potentially variable) ra- 
dio sources at 34 GHz. This could be especially problematic 
where sources are in the clusters themselves rather than in 
the background: the low redshifts of the clusters imply such 
sources may be very bright. Accounting for these effects correctly necessitates the
exploration of the posterior probabilities of the parameters of a $\beta$-model for the
gas distribution given the VSA visibilities, receiver noise, the CMB and radio sources.
The method must also incorporate prior knowledge on e.g. the cluster positions from
X-rays, and on source
fluxes in a way which can cope with variability. We assume 
isothermality, and that the clusters are well described by hy- 
drostatic equilibrium. We use a Markov Chain Monte Carlo 
(MCMC) sampler (BayeSys) for an acceptable combination 
of speed and accuracy. 

In section 2 we briefly describe the relevant features 
of the VSA. In section 3 we present the sample, outline the 
data reduction pipeline and describe our strategy for dealing 
with radio sources. In section 4 we present our results, and 
attempt to describe the Bayesian analysis method in non- 
specialist terms. We make concluding comments in section 
5. 



2 THE VERY SMALL ARRAY 

The VSA is a 14-element interferometric telescope situated 
at the Observatorio del Teide, Tenerife. The observing fre- 
quency is tunable in the 26-36 GHz range, with a bandwidth 
of 1.5 GHz; at these frequencies observations should be rel- 
atively free from contamination by Galactic foregrounds for 
fields at high Galactic latitude. The 14 antennas are identi- 
cal. They rotate independently and are mounted on a tilting 
table thus allowing tracking in two dimensions. The table is 
surrounded by an aluminium shield to prevent groundspill. 

The telescope was designed to operate in two configurations: Compact (see e.g. Watson et
al. 2003 for technical details) and Extended (see Grainge et al. 2003). All data in this
paper were taken using the Extended configuration. The Extended Array has 322-mm diameter
illuminated apertures, resulting in a primary beam of 2.0° FWHM when operating at 34 GHz.
The horn arrangement on the table allows for a range of baselines between approximately
40 cm and 3 m. The telescope is sensitive to angular sizes in the range
$0.25^\circ < \theta < 1.2^\circ$, and is ideal for observing low redshift clusters.
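
As a rough check on these numbers, the angular scale to which a baseline of length $b$
responds is of order $\lambda/b$. The sketch below evaluates this rule of thumb for the
approximate baseline range quoted above. It is an order-of-magnitude estimate only: the
limits quoted in the text also depend on the aperture illumination and on how the angular
size is defined, so exact agreement should not be expected. The function name is
illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def fringe_scale_deg(baseline_m, freq_hz):
    """Characteristic angular scale lambda / b, in degrees, for a baseline b."""
    wavelength = C / freq_hz
    return math.degrees(wavelength / baseline_m)


freq = 34e9  # observing frequency quoted in the text
for b in (0.4, 3.0):  # approximate shortest and longest baselines from the text
    print(f"b = {b:.1f} m -> lambda/b = {fringe_scale_deg(b, freq):.2f} deg")
```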

Radio sources are a problem in all cm-wave CMB ob- 
servations at all but the lowest angular resolutions, and SZ is 
no exception. The VSA design includes a dedicated source- 
subtraction telescope. This comprises two 3.7 m dishes lo- 
cated next to the main array and used as an interferometer 
with a 9m baseline, giving 4 arcmin resolution and a 9 ar- 
cmin field of view. The source-subtractor does not resolve 
any of the sources which we observe, but resolves out the 
CMB fluctuations. 

3 OBSERVATIONS 

3.1 Galaxy Clusters 

The VSA targets were selected from the Northern ROSAT All-Sky Survey (Böhringer et al.
2000, NORAS hereafter) as the seven most X-ray luminous objects at redshift $z < 0.1$.
The clusters have rest-frame X-ray luminosity $> 5 \times 10^{37}$ W in the 0.1-2.4 keV
energy band. Additionally, only clusters observable from Tenerife and Cambridge were
considered. This imposed declination limits of $10^\circ < \delta < 60^\circ$. The upper
limit is set by the latitude and configuration of the VSA main array. The lower limit is
set by the need for the use of the Ryle Telescope (RT) as part of the source-subtraction
strategy (see section 3.3). Note that we have not applied any criteria concerning fluxes
of contaminant radio sources. This is unlike the VSA primordial work, and indeed the SZ
work of the RT and OVRO/BIMA.

Pointing centres for the seven fields were defined based 
on the X-ray positions of the clusters as published in NO- 
RAS. Data for each target were obtained in a series of short 
observations made during the period October 2001-August 
2003. Repeat observations were required in several cases due 
to uncharacteristically persistent bad weather. The sample
is summarised in Table 1 along with published redshifts,
temperatures used in our analysis, X-ray luminosities and 
total integration times of the VSA observations. The clus- 
ters A401 and A399 are only separated by around a degree, 
so were observed in a single pointing centred on A401. 

3.2 Calibration and Data Reduction 

The primary calibrator for all VSA observations is Jupiter. We based our calibration
scale on the effective temperature of the planet at 34 GHz: $T_{34} = 155 \pm 5$ K
(Mason et al. 1999). The flux scale is transferred to our other calibration sources:
Cas A and Tau A. The calibrators are observed on a daily basis, allowing flux and phase
calibration at regular intervals. Cas A and Tau A are partially resolved on the longest
VSA baselines: we overcome this problem by applying models as discussed in Grainge et al.
(2003). Full details of the VSA calibration will be presented in a forthcoming paper.
Note that in Dickinson et al. (2004) and Rebolo et al. (2004) we re-scale our calibration
to agree with the recent WMAP results.

The data reduction pipeline for galaxy clusters is identical to that employed in the
processing of our CMB data, and is presented in detail in Watson et al. (2003). Each
observation is analysed independently using the reduce software,
developed by the VSA team. The procedure is now highly 
developed, allowing virtually automatic correcting, flagging, 
filtering and re-weighting of the data. However, each raw 
data file must be checked by eye at least once to eliminate 
some 'bad' data (due to bad weather or telescope malfunc- 
tion), and to ensure optimum quality in the reduced data. 
It is also necessary to identify files requiring special filtering 
depending on where the Sun, Moon or a bright planet was 
during the observation. The resulting calibrated visibilities 
from each observation are taken and stacked together. 

The data were reduced independently by the groups at 
the Cavendish, the IAC and JBO, and the results found 
to be fully consistent. Approximately 28% of the data were 
discarded due to bad weather, filtering and telescope down- 
time. 

The form of data from the single baseline source- 
subtraction interferometer is identical to that of the main 
array and is processed in a similar way. The primary flux 
calibrator is NGC 7027. The flux scale from this is applied 
to our other flux calibrators. We use interleaved calibrators 
in order to monitor the telescope phase. 



3.3 Radio Sources 

Contamination by radio sources can be a large problem for CMB observations. The
contribution goes as $\ell^2$ so tends to be more problematic for the (often
higher-resolution) SZ work than for primordial CMB observations. In order to
map the SZ effect accurately, it is necessary to account for 
the effect of radio sources which may be part of, in front of, 
or behind the cluster. The VSA source-subtraction interfer- 
ometer allows potentially problematic sources to be observed 
simultaneously with main array observations of the cluster 
fields. 

As no high frequency ($\approx 34$ GHz) survey of the radio sky
is available, we scheduled source observations via a two-fold 
approach: 

• The NVSS and GB6 catalogues (Condon et al. 1998; Gregory et al. 1996) were examined
for sources within a radius of 2° from the cluster centres. Source fluxes at 1.4 and
4.9 GHz were used to perform a simple extrapolation to 30 GHz, thus making some
prediction of the approximate level of contamination in the SZ observations. All sources
with predicted flux greater than 50 mJy were selected for observation with the VSA
source-subtractor.

• In order to account for flat or rising spectrum sources not seen at the lower
frequencies, the RT was used to survey the central square degree of each field at 15 GHz
with the rastering technique described by Waldram et al. (2003). Peaks > 20 mJy in the
raster maps were recorded and the corresponding position list was added to the
source-subtractor observing queue. This ensured that we accounted not only for all
potentially bright sources in the field, but also for fainter sources which may have
been present in the critical central regions of the SZ fields.

Table 1. The VSA cluster sample: cluster coordinates (Böhringer et al. 2000), redshift
(Struble & Rood 1991), electron temperature (Markevitch et al. 1998; except Coma, Hughes
et al. 1988), X-ray luminosity (Böhringer et al. 2000), integration time and map rms
(outside the primary beam).

Cluster  RA           Dec         z       T_e             L_X        T_int    rms
         (B1950)      (B1950)             (keV)           (10^37 W)  (hours)  (Jy)

Coma     12 57 18.29  28 12 28.5  0.0232  9.1 ± 0.7       7.01       80       0.021
A1795    13 46 34.43  26 50 37.5  0.0616  7.8 ± 1.0       9.93       115      0.020
A399     02 55 05.33  12 50 57.6  0.0715  7.0 ± 0.4       6.78       96       0.030
A401     02 56 12.55  13 22 50.1  0.0748  8.0 ± 0.4       11.76      96       (as A399)
A478     04 10 40.89  10 20 26.0  0.0882  ?               13.31      74       0.018
A2142    15 56 16.45  27 22 08.0  0.0899  9.7 +?/-?       20.52      73       0.023
A2244    17 00 52.86  34 07 54.5  0.0980  7.1 +5.0/-2.2   7.39       91       0.018
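
The catalogue-based selection in the first step above can be sketched as follows. The
sketch assumes that the 'simple extrapolation' is a single power law,
$S \propto \nu^{-\alpha}$, fixed by the 1.4 and 4.9 GHz fluxes; the exact procedure is
not specified in the text, and the function names and the example fluxes are hypothetical.

```python
import math


def spectral_index(s_lo, nu_lo, s_hi, nu_hi):
    """Two-point spectral index alpha, with the convention S ~ nu**(-alpha)."""
    return -math.log(s_hi / s_lo) / math.log(nu_hi / nu_lo)


def extrapolate_flux(s_lo, nu_lo, s_hi, nu_hi, nu_target):
    """Power-law extrapolation of the flux density to nu_target."""
    alpha = spectral_index(s_lo, nu_lo, s_hi, nu_hi)
    return s_hi * (nu_target / nu_hi) ** (-alpha)


# Hypothetical source: 400 mJy at 1.4 GHz and 200 mJy at 4.9 GHz.
predicted = extrapolate_flux(400.0, 1.4, 200.0, 4.9, 30.0)
needs_followup = predicted > 50.0  # 50 mJy selection threshold from the text
print(f"predicted 30 GHz flux: {predicted:.0f} mJy, follow up: {needs_followup}")
```

A two-point power law cannot anticipate spectra that flatten or rise above 4.9 GHz,
which is the gap the 15 GHz RT survey in the second step is designed to fill.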

A summary of the source lists for all clusters is presented in
Table 2, including fluxes measured by the source-subtractor.
The 15 GHz fluxes are those from RT pointed observations. 
Whereas for our primordial anisotropy work source fluxes 
were subtracted directly from the visibilities, we choose here 
to use our measured fluxes as priors in the Bayesian fitting 
software. Due to telescope malfunction at various stages dur- 
ing our observing schedule, not all sources were observed 
simultaneously with the corresponding cluster. In order to 
account for possible variability in the source flux, broader 
priors were used than would have been assumed otherwise. 
Directly subtracting source fluxes with such uncertainties 
would lead to biases when fitting to the SZ data. 

We can assess how much the SZ detections are affected by confusion noise from sources
not found in the above, as follows. A corollary of Scheuer's work (Scheuer 1957) is that
confusion is worst when there is $\sim 1$ source per synthesised beam. Examination of
Table 2 shows that in the RT surveying, at about 20 mJy there is less than one source per
VSA average SZ synthesised beam. A rough extrapolation indicates that there is one source
per beam at 34 GHz at a level of 10 mJy. Since the detected SZ fluxes are $\sim 150$ mJy,
it is evident that the source strategy is adequate.



4 RESULTS 
4.1 Maps 

The flagged and stacked data are held as visibility files, con- 
taining the real and imaginary part for each observed uv- 
position along with an associated rms noise level. Standard 
AIPS tasks are used to make maps, and to perform CLEAN- 
ing using one CLEAN box encompassing the area of the 
VSA primary beam. All analysis and parameter fitting is 
performed in the visibility plane; the maps presented here 
along with the resulting discussion are included purely to 
illustrate the results of our SZ programme. 

We expect a larger SZ response on the shortest baselines, so an appropriate Gaussian
taper is applied in each case. This emphasises structure on large scales. Taper values
were chosen based on the range of uv radii available in each cluster's data. In order to
determine appropriate tapers for our sample, we used cluster parameters from Mason et al.
(2001) (as listed in Table ?) to generate predicted SZ profiles. These are shown in
Figure 1. (We observe that the Mason et al. (2001) value for the core radius of A399
($4.33 \pm 0.45$ arcmin) is in direct conflict with that reported by Sanderson & Ponman
(2003) ($1.89 \pm 0.36$ arcmin). The use of Mason et al.'s parameter may result in an
over-estimate of the SZ flux from this cluster.) The chosen tapers are
$\approx 0.1$ k$\lambda$, although the taper for Coma would ideally be
$\approx 0.023$ k$\lambda$. This cuts out nearly all Extended Array baselines, so a value
of $\sim 0.1$ k$\lambda$ was used with good results. These maps of the VSA cluster sample
are presented in Figure 2. The contours are $1.5\sigma$, where $\sigma$ is the rms noise
level presented in Table 1. We comment on the significance of the detections in each map,
and also the strength of the observed primordial features. We emphasise that this is not
intended to be a quantitative analysis of the signal to noise ratio achieved for each
cluster.

4.1.1 Coma: Map (a)

Coma is at redshift $z = 0.0232$, giving it an angular size on the sky roughly four times
greater than any other cluster in the sample. It would ideally be observed on baselines
even shorter than those of the VSA. However, the SZ signal from this cluster is so strong
that we detect it at $7.5\sigma$. Primordial features at $4.5\sigma$ are visible around
the SZ decrement.

4.1.2 A1795: Map (b) 

A1795 is also detected at the $7.5\sigma$ level. This map contains
a bright positive primordial feature south of the cluster. 

4.1.3 A399 and A401: Map (c) 

A399 does not appear in the map. We argue that this is 
most probably due to contamination by primordial CMB. 
Although the contours are negative at the position of A401, 
we suggest that this is largely due to the primordial decre- 
ment east of the cluster position. The SZ signal from the 
cluster may be contributing in part, but it is important not 
to confuse the two effects. The centre of the obvious decre- 
ment is around 15 arcmin away from the X-ray centre of 
A401. 

4.1.4 A478: Map (d)

The A478 map shows a $6\sigma$ SZ detection. Primordial CMB structures are visible all
around the cluster, varying in strength from 3 to $4.5\sigma$.
Table 2. Radio sources present in the cluster fields. The asterisked source was predicted
to have flux less than 50 mJy, but Mason et al. (2001) suggest it may be variable.

Cluster    RA        Dec        Predicted flux  RT survey  VSA source-subtractor
           (B1950)   (B1950)    34 GHz          15 GHz     34 GHz
                                (mJy)           (mJy)      (mJy)

Coma       12 48 36  +28 39 47  75                         46 ± 11
           12 49 25  +28 07 55  71                         29 ± 9
           12 50 49  +27 55 57  99                         82 ± 8
           12 51 46  +27 53 41  311                        250 ± 3
           12 54 04  +27 17 17  57                         56 ± 5
           12 55 36  +28 36 36  96              49 ± 3     26 ± 9
           12 56 08  +29 25 19  53                         10 ± 12
           12 57 11  +28 13 40  -               27 ± 3     34 ± 7
           12 58 04  +28 46 18  226             251 ± 13   207 ± 10
           12 58 56  +28 37 45  -               34 ± 3     31 ± 5
           12 58 59  +28 58 59  168                        10 ± 7
           12 59 58  +27 25 17  58                         49 ± 9
           13 03 59  +27 18 37  52                         45 ± 9
A1795      13 39 50  +27 24 42  521                        380 ± 9
           13 45 45  +25 16 01  521                        12 ± 7
           13 46 09  +26 42 42  89                         8 ± 10
           13 46 34  +26 50 25  36              51 ± 3     31 ± 9
           13 49 03  +27 19 48  -               8 ± 3      20 ± 11
           13 49 41  +25 24 17  71                         7 ± 6
A399/A401  02 53 51  +13 22 25  325             342 ± 17   235 ± 8
           02 55 24  +13 40 10  32*                        36 ± 4
           02 55 47  +13 22 19  37              52 ± 3     29 ± 4
           02 56 01  +11 31 00  84                         54 ± 9
           02 56 52  +13 42 59  35              66 ± 3     26 ± 5
           02 57 25  +11 25 45  60                         55 ± 4
           02 58 34  +13 03 53  28              17 ± 3     13 ± 6
           02 59 48  +12 07 18  305                        107 ± 9
           03 00 23  +12 57 22  80                         97 ± 7
A478       04 08 52  +08 35 38  190                        61 ± 12
           04 10 55  +11 04 43  836                        395 ± 9
           04 11 02  +10 10 19                  14 ± 3     7 ± 4
A2142      15 48 08  +27 27 02  166                        58 ± 7
           15 52 28  +27 55 35  61                         2 ± 6
           15 58 04  +27 11 13  163                        5 ± 6
           15 58 57  +26 53 35                  56 ± 3     17 ± 6
           16 00 03  +26 18 43  57                         38 ± 6
           16 00 35  +26 54 15  498                        176 ± 14
           16 04 54  +27 25 22  326                        186 ± 17
A2244      16 53 50  +32 48 55  88                         48 ± 6
           16 56 12  +34 48 01  512                        297 ± 11
           17 06 12  +33 50 37  110                        95 ± 8


4.1.5 A2142: Map (e) 

The $7.5\sigma$ detection of A2142 appears to be relatively free
from bright primordial features. 

4.1.6 A2244: Map (f) 

A2244 does not appear in the map. Again, we suggest that 
the cluster may be coincident with a peak in the CMB. 

4.2 Cluster Model 

In the SZ effect, incident CMB photons are Compton scattered by the hot gas in a
cluster's potential well. At frequencies less than 217 GHz, a brightness temperature
decrement in the microwave background is observed. This is proportional to the
'Comptonisation parameter'

$$ y = \frac{\sigma_{\rm T}}{m_e c^2} \int n_e k T \, dl, \tag{1} $$

where $\sigma_{\rm T}$ is the Thomson cross-section and $m_e c^2$ is the electron rest
energy. The parameter $y$ is proportional to the line integral of pressure through
the cluster. This can be calculated from modelled gas density
distributions.
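
As an illustration of how $y$ follows from a modelled gas distribution, the sketch below
evaluates the central Comptonisation parameter for an isothermal $\beta$-model, the
profile introduced later in this section, taking the standard definition
$y = (\sigma_{\rm T}/m_e c^2)\int n_e kT\,dl$. For this profile the line-of-sight
integral through an infinitely extended cluster has the closed form
$\sqrt{\pi}\,r_c\,\Gamma(3\beta/2 - 1/2)/\Gamma(3\beta/2)$. The input values are
hypothetical and are not fitted parameters from this work.

```python
import math

SIGMA_T = 6.6524587e-29    # Thomson cross-section, m^2
ME_C2_KEV = 510.99895      # electron rest energy, keV
MPC_IN_M = 3.0856776e22    # one megaparsec in metres


def central_y(ne0_m3, kT_keV, r_c_mpc, beta):
    """Central Comptonisation parameter of an isothermal beta-model.

    Uses the analytic line-of-sight integral
    int (1 + l^2/r_c^2)^(-3 beta/2) dl = sqrt(pi) r_c G(3b/2 - 1/2) / G(3b/2),
    valid for beta > 1/3 and an infinitely extended cluster.
    """
    if beta <= 1.0 / 3.0:
        raise ValueError("line-of-sight integral diverges for beta <= 1/3")
    path = (math.sqrt(math.pi) * r_c_mpc * MPC_IN_M
            * math.gamma(1.5 * beta - 0.5) / math.gamma(1.5 * beta))
    return SIGMA_T * (kT_keV / ME_C2_KEV) * ne0_m3 * path


# Hypothetical cluster: n_e(0) = 3e3 m^-3, kT = 8 keV, r_c = 0.3 Mpc, beta = 2/3.
y0 = central_y(3.0e3, 8.0, 0.3, 2.0 / 3.0)
print(f"central y = {y0:.2e}")
```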

Figure 1. Predicted SZ profiles for the cluster sample.

As we are working with specifically large-angular-scale SZ data, contamination from
primordial CMB features is considerable, thus adding an extra 'noise' term. (In our
parameter inference, this is dealt with appropriately as an additional source of Gaussian
noise; see section 4.4.) This restricts us to a highly constrained, simple model. We
choose to follow Grego et al. (2001) in fitting a $\beta$-model (Cavaliere &
Fusco-Femiano 1976, 1978) to the cluster visibilities. We too simplify the problem by
assuming the clusters to be spherically symmetric and in hydrostatic equilibrium. (Note:
strictly, the assumptions of isothermality, $\beta$-profile and hydrostatic equilibrium
are incompatible. However, to good approximation, they are compatible over a wide range
of $r$ for $\beta$ close to 2/3. See King 196?.) In the $\beta$-model,
the gas density as a function of radius takes the form



$$ \rho_{\rm gas}(r) = \frac{\rho_{\rm gas}(0)}{\left[1 + (r/r_c)^2\right]^{3\beta/2}}, \tag{2} $$



where $r_c$ (core radius) and $\beta$ are parameters of the fit. From
the assumptions of hydrostatic equilibrium and gas isothermality at temperature $T$,

$$ \frac{kT}{\mu}\frac{d\rho_{\rm gas}}{dr} = -\frac{G M_r\,\rho_{\rm gas}}{r^2}, \tag{3} $$



where $M_r$ is the total mass internal to $r$, $\mu$ is the mass per particle, and $k$
and $G$ are the Boltzmann and gravitational constants. Equations (2) and (3) lead to the
following expression for the total mass distribution:

$$ M_r = \frac{3\beta r^3}{r_c^2 + r^2}\,\frac{kT}{\mu G}. \tag{4} $$



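This can be adapted usefully to calculate cluster masses out
to some overdensity, e.g. $r_{200}$:

$$ M_{200} = \frac{4\pi}{3}\,r_{200}^3\,(200\rho_{\rm crit}) \tag{5} $$

$$ \phantom{M_{200}} = \frac{3\beta r_{200}^3}{r_c^2 + r_{200}^2}\,\frac{kT}{\mu G}. \tag{6} $$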



In this work, we choose to calculate quantities out to $r_{200}$ as
this is a good approximation to the virial radius of a cluster.
Previous studies have used $r_{500}$ so we have also extended our
calculations to produce results to this radius for comparison
purposes.
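
Equating the two expressions for the mass within the overdensity radius, equations (5)
and (6), gives a closed form for that radius,
$r_{200}^2 = 9\beta kT/(800\pi G\mu\rho_{\rm crit}) - r_c^2$, from which $M_{200}$
follows. The sketch below evaluates it. The input values are hypothetical,
$\mu = 0.6\,m_p$ is an assumed mean mass per particle, and for simplicity the critical
density is taken at $z = 0$ rather than at the cluster redshift.

```python
import math

G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
M_P = 1.6726e-27           # proton mass, kg
KEV = 1.602177e-16         # one keV in joules
MPC = 3.0856776e22         # one megaparsec in metres
M_SUN = 1.989e30           # solar mass, kg


def total_mass(r, r_c, beta, kT, mu):
    """Equation (4): hydrostatic, isothermal beta-model mass within r (SI)."""
    return 3.0 * beta * r**3 * kT / ((r_c**2 + r**2) * mu * G)


def r200(r_c, beta, kT, mu, rho_crit):
    """Radius at which equations (5) and (6) agree (SI units)."""
    r_sq = 9.0 * beta * kT / (800.0 * math.pi * G * mu * rho_crit) - r_c**2
    if r_sq <= 0.0:
        raise ValueError("no positive solution for r200")
    return math.sqrt(r_sq)


# Hypothetical inputs: h = 0.7, kT = 8 keV, r_c = 0.3 Mpc, beta = 2/3, mu = 0.6 m_p.
H = 70.0e3 / MPC                             # Hubble parameter in s^-1
rho_crit = 3.0 * H**2 / (8.0 * math.pi * G)  # critical density at z = 0
r = r200(0.3 * MPC, 2.0 / 3.0, 8.0 * KEV, 0.6 * M_P, rho_crit)
m200 = total_mass(r, 0.3 * MPC, 2.0 / 3.0, 8.0 * KEV, 0.6 * M_P)
print(f"r200 = {r / MPC:.2f} Mpc, M200 = {m200 / M_SUN:.2e} Msun")
```

The same function gives $r_{500}$ if the factor 800 is replaced by 2000, since the
overdensity enters only through the product of the overdensity and $\rho_{\rm crit}$.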

Figure 2. CLEANed VSA maps ((a)-(f)) of the clusters Coma, A1795, A399/A401 (where A399
is furthest south), A478, A2142 and A2244. The X-ray centre is marked in each case. The
half-power CLEAN beam is shown in the bottom right corner of each plot; contours are
$1.5\sigma$. Radio sources have been subtracted and the coordinates are B1950.

From the gas density distribution (2) it is straightforward to compute the gas mass to
this radius:

$$ M_{\rm gas}(r) = \int_0^r 4\pi r'^2 \rho_{\rm gas}(r')\,dr' \tag{7} $$

$$ \phantom{M_{\rm gas}(r)} = 4\pi \rho_{\rm gas}(0)\,r_c^3 \int_0^{r/r_c} \frac{x^2\,dx}{(1 + x^2)^{3\beta/2}}. \tag{8} $$



The above integral is evaluated numerically. We choose to
parameterise in terms of $M_{\rm gas}$, and can solve for the gas
density in order to compute the Comptonisation parameter.
The calculated values can then be compared to real VSA 
data. 
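
A minimal sketch of this step is given below. It assumes a composite Simpson's rule for
the numerical integration of the $\beta$-model gas-mass integral (the text does not say
which scheme is used), and the helper names are illustrative. The last function inverts
the relation, recovering the central density from a chosen gas mass, as is needed when
the model is parameterised by $M_{\rm gas}$.

```python
import math


def beta_integrand(x, beta):
    """Dimensionless integrand x^2 / (1 + x^2)^(3 beta / 2)."""
    return x * x / (1.0 + x * x) ** (1.5 * beta)


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def gas_mass(rho0, r_c, beta, r):
    """Gas mass within radius r for central density rho0 (consistent units)."""
    shape = simpson(lambda x: beta_integrand(x, beta), 0.0, r / r_c)
    return 4.0 * math.pi * rho0 * r_c**3 * shape


def central_density(m_gas, r_c, beta, r):
    """Invert gas_mass: the central density that gives m_gas within r."""
    return m_gas / gas_mass(1.0, r_c, beta, r)
```

For $\beta = 2/3$ the integral has the closed form $X - \arctan X$ with $X = r/r_c$,
which provides a convenient check on the numerical scheme.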

The gas fraction is defined as

$$ f_{\rm gas} = \frac{M_{\rm gas}}{M_r}, \tag{9} $$

in which $M_{\rm gas}$ and $M_r$ are evaluated to the same radius. $f_{\rm gas}$
evaluated by this method is proportional to $h^{-1}$. One way to see this is as follows.
In equation (8) the $h$-dependences of the limit $r_{200}/r_c$ cancel,
$\rho_{\rm gas}(0)$ is a local quantity and so not $h$-dependent, and only $r_c^2$
depends on $h$ because the third factor of $r_c$ is along the line of sight; thus
$M_{\rm gas} \propto h^{-2}$. In equation (4), $M_r \propto r^3/(r_c^2 + r^2) \propto
h^{-1}$. So $f_{\rm gas} \propto h^{-1}$.



4.3 Interferometric Data 

Interferometers sample the uv-plane so it follows that the most straightforward approach
is to fit to the visibility data directly. This is further motivated by the following
points. The instrument noise is Gaussian in the uv-plane, and independent between
visibilities. In the map plane the noise is highly correlated spatially. In addition,
fitting to the visibilities naturally avoids the problem of synthesised beam
deconvolution. The primordial CMB is well understood in the uv-plane in terms of the
measured power spectrum, so can be factored into the computation (see section 4.4 for
details). Finally, the inclusion of point sources is straightforward.



4.4 Contaminants 

There are two relevant astrophysical contaminants to the SZ 
data: primordial fluctuations in the CMB, and foreground 
radio sources. Emission from the Galaxy is taken to be neg- 
ligible in this analysis. 

Primordial CMB fluctuations, recognised as a source of Gaussian noise with known angular
power spectrum, are included in a non-diagonal covariance matrix when calculating the
misfit between predicted and observed data (Reese et al. 2002; Marshall et al. 2003). We
observed bright primordial features in all of our cluster maps, and indeed they are
evident in Figure 2. As the negative primordial features are of similar strengths and on
similar angular scales to the cluster decrements, it is necessary to apply fairly tight
positional priors (see section 4.5). As regards $f_{\rm gas}$ estimates, we argue that
the position is acceptable as the effect of the CMB tends to produce a cancelling effect
on $M_{\rm gas}$ and $M_r$ (see section 4.?).
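
The role of the non-diagonal covariance matrix can be sketched as follows. Assuming the
data vector holds real numbers (for instance the real and imaginary parts of the
visibilities, stacked) and that receiver noise and CMB are both Gaussian, the
log-likelihood uses the total covariance $C = N + C_{\rm CMB}$, with $N$ diagonal. The
pure-Python Cholesky factorisation is chosen only to keep the example self-contained; the
function names are illustrative and this is not the analysis code used for the VSA data.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def log_likelihood(data, model, noise_var, cmb_cov):
    """Gaussian log-likelihood with total covariance diag(noise_var) + cmb_cov."""
    n = len(data)
    cov = [[cmb_cov[i][j] + (noise_var[i] if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    resid = [d - m for d, m in zip(data, model)]
    # Forward substitution: solve low * z = resid, so that chi2 = z . z
    z = []
    for i in range(n):
        z.append((resid[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    chi2 = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (chi2 + log_det + n * math.log(2.0 * math.pi))
```

When the CMB term is set to zero the expression reduces to the familiar sum over
independent Gaussian errors; the off-diagonal terms are what allow correlated primordial
structure to be treated as noise rather than fitted as signal.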

The point sources present in each field are also included in the model of the sky. The
source-subtractor data allow the determination of the fluxes and positions of these
objects: we translate these measurements into appropriate priors (see section 4.5) on
the source parameters. These 'nuisance parameters' are then marginalised out.



4.5 Parameter Inference 

4.5.1 Basic considerations

In inferring cluster parameters, the traditional route followed in the literature is the
Maximum Likelihood method. This method was used in, for example, the SZ and gas fraction
work of Grego et al. (2001). Computational restrictions at the time prevented the use of
the fully Bayesian analysis we perform in this paper. The likelihood of a dataset
$L({\rm data}|\theta)$ is the product of the probability distributions of the constituent
data points, where $\theta$ is used to characterise a set of parameters such as $\beta$
and core radius. This likelihood may be maximised to find the best-fit value for each
parameter of the set $\theta$. This approach:

(i) assumes that the parameters of a model have a true 
set of values, and that obtaining data from an appropriate 
experiment will measure this set of values; 

(ii) can be formulated in terms of a single misfit statistic 
when describing the difference between the predictions of 
a model and a measurement: maximising a Gaussian likeli- 
hood for data with uncorrelated errors is equivalent to min- 
imising the mean-squared residual, or chi-squared statistic; 

(iii) usually assumes Gaussian noise, although indeed this 
can be modified to incorporate the correct distribution (e.g. 
Poisson) for a particular case. 
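
Point (ii) can be checked numerically. For independent Gaussian errors,
$-2\ln L = \chi^2 + {\rm const}$, so the parameter value that maximises the likelihood is
the one that minimises chi-squared. The sketch below fits a single amplitude scaling a
fixed template; the function names are illustrative and the data values are invented.

```python
import math


def chi_squared(amplitude, data, template, sigma):
    """Chi-squared misfit of data to amplitude * template."""
    return sum(((d - amplitude * t) / s) ** 2
               for d, t, s in zip(data, template, sigma))


def log_likelihood(amplitude, data, template, sigma):
    """Gaussian log-likelihood for independent errors."""
    norm = sum(math.log(2.0 * math.pi * s * s) for s in sigma)
    return -0.5 * (chi_squared(amplitude, data, template, sigma) + norm)


def best_fit_amplitude(data, template, sigma):
    """Amplitude that minimises chi-squared (and so maximises the likelihood)."""
    num = sum(d * t / (s * s) for d, t, s in zip(data, template, sigma))
    den = sum(t * t / (s * s) for t, s in zip(template, sigma))
    return num / den


# Made-up numbers purely for illustration.
data = [-1.9, -1.2, -0.4]
template = [1.0, 0.6, 0.3]
sigma = [0.5, 0.5, 0.5]
a_hat = best_fit_amplitude(data, template, sigma)
print(f"best-fit amplitude {a_hat:.3f}, "
      f"chi-squared {chi_squared(a_hat, data, template, sigma):.3f}")
```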

The Maximum Likelihood method focuses on the es- 
timation of true parameters from data, while neglecting 
the full distributions for those parameters. When signal-to- 
noise is low, these distributions are broad and very unlikely 
to be Gaussian: we summarise the difficulties in this situa- 
tion as follows. 

Maximum Likelihood does not describe the joint process
of observation and inference. We have a set of noisy
visibilities (the data) which we attempt to explain by a model
or hypothesis, H. The hypothesis includes the notions, for
example, that the SZ signal comes from a gas distribution
(which we assume here to have a β-profile) and that sources
and CMB primordials are present, and also the assumption
that we understand the experiment in question (i.e. the
interferometer works). The data model includes the
parameter set θ as defined above. We wish to estimate θ from
our data, that is, we wish to examine the probability
distribution P(θ|data, H). N.B.: the notation P(A|B) refers to
the probability of A given B. Rather than achieving this, the
Maximum Likelihood method assesses the data while taking
it as given that θ has some true value, as outlined in point (i)
above. In other words, it evaluates just the peak of the
probability distribution P(data|θ, H). Application of Bayes'
theorem allows us to relate the two distributions P(θ|data, H)
(the posterior) and P(data|θ, H) by



$$P(\theta|\mathrm{data},H) = \frac{P(\mathrm{data}|\theta,H)\,P(\theta|H)}{P(\mathrm{data}|H)} \tag{10}$$

The additional factors in equation (10) are the prior
probability distribution, P(θ|H), and the evidence, P(data|H), to
which we will return shortly.
In addition, point (ii) is not generally correct. Even if
P(data|θ, H) is Gaussian, it is multiplied by the prior P(θ|H)
which may, for example, be asymmetric. Once one starts to
produce resultant probability density functions by
multiplication, the distributions are certainly going to be
complicated. The probabilities outlined above are functions. The
standard Maximum Likelihood approach characterises such
probability distributions by a single value with an error bar.
The characterisation of probability distributions with
approximate Gaussians is therefore misleading and may
underestimate the final uncertainty in a quantity such as f_gas. It is
clearly preferable to retain all the information contained in
the entire function, rather than working with single-value
parameters. As mentioned above, point (iii) can be dealt
with appropriately.

Propagating the likelihood function via Bayes' theorem 
thus overcomes points (i) and (ii) above. It also delivers 
additional advantages, summarised as follows: 

• Conditioning on a particular value of a parameter
implies a delta-function prior, a state of knowledge that never
occurs. It is now possible to deal with continuous
probability distribution functions in many dimensions (e.g.
positions, core radii, M_T etc.) rather than having to work just
with peaks and widths of artificially low-dimension
probability distributions. A desire to concentrate on a subset
of interesting parameters leads directly to the concept of
marginalisation (see e.g. Sivia (1996)).

• The method leads directly to the evaluation of the
evidence, an extremely useful quantity that enables one to
assess the relative suitability of a set of hypotheses (see e.g.
Hobson et al. (2002)).

The evidence in equation (10) is P(data|H) and is an
integral over all parameters in the N-dimensional parameter
vector θ:

$$P(\mathrm{data}|H) = \int P(\mathrm{data}|\theta,H)\,P(\theta|H)\,\mathrm{d}^N\theta \tag{11}$$

This can be applied usefully to help distinguish between
different hypotheses, say H_1 and H_2: Bayes' theorem
(equation (10)) can be applied in order to evaluate and compare
P(H_1|data) and P(H_2|data). In doing this, P(data) cancels
out and we obtain

$$\frac{P(H_1|\mathrm{data})}{P(H_2|\mathrm{data})} = \frac{P(\mathrm{data}|H_1)}{P(\mathrm{data}|H_2)}\,\frac{P(H_1)}{P(H_2)} \tag{12}$$

Thus hypotheses may be compared. For example, we 
can evaluate the hypothesis that an SZ cluster is in a par- 
ticular, small patch of sky. We can compare this with the 
evidence given an alternative hypothesis, this time deem- 
ing that the cluster be found in a larger area of sky. The 
hypothesis probability ratio given in equation (12) provides
the means by which the suitability of these two priors can 
be assessed. Such additional information may be obtained 
from elsewhere; in this particular example X-ray data may 
be used to good effect. 
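
The comparison of a narrow and a wide position prior can be illustrated with a one-dimensional toy problem. The sketch below is not the analysis code used in this work: the measurement (0.5 ± 0.3 arcmin), the prior ranges and the function name `evidence_1d` are all assumptions chosen for illustration. It evaluates the evidence integral of equation (11) by simple quadrature for two uniform priors and forms their ratio, which equals the posterior odds of equation (12) when the two hypotheses are given equal prior probability.

```python
import numpy as np

def evidence_1d(log_like, prior_lo, prior_hi, n_grid=20001):
    """P(data|H) for a uniform prior on [prior_lo, prior_hi] (grid quadrature)."""
    theta = np.linspace(prior_lo, prior_hi, n_grid)
    prior = np.full(n_grid, 1.0 / (prior_hi - prior_lo))  # normalised uniform prior
    integrand = np.exp(log_like(theta)) * prior
    # Trapezoidal rule for the integral in equation (11).
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(theta))

# Hypothetical one-dimensional "cluster position" measurement (illustrative numbers).
x_obs, sigma = 0.5, 0.3

def log_like(x):
    """Gaussian log-likelihood of the toy measurement given a true position x."""
    return -0.5 * ((x_obs - x) / sigma) ** 2 - 0.5 * np.log(2.0 * np.pi * sigma**2)

ev_small = evidence_1d(log_like, -2.0, 2.0)    # H_1: cluster in a small patch of sky
ev_large = evidence_1d(log_like, -20.0, 20.0)  # H_2: cluster somewhere in a larger area
bayes_factor = ev_small / ev_large             # posterior odds if P(H_1) = P(H_2)
print(f"evidence ratio H_1 / H_2 = {bayes_factor:.2f}")
```

Because the likelihood sits well inside both priors, the ratio is close to the ratio of the prior widths: the data favour the hypothesis that wastes less prior volume.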

We note that both Maximum Likelihood and Bayesian
methods can cope with correlated data (see e.g.
Marshall et al. (2003), Reese et al. (2002) as before) but
simple chi-squared minimisation cannot.



4.5.2 Characterising the posterior Probability Density
Function (PDF)

Having summarised the advantages of the Bayesian route,
we now turn to the problem of calculating the posterior
distribution P(θ|data, H). One method is to evaluate it as a
product of the probabilities for every visibility, for all
possible values of each of the N parameters in θ. This is the 'brute
force' approach, involving the calculation of the likelihood
over a huge hypercube. This technique is now plausible for
application to the CMB primordial power spectrum, given
that the CMB itself has a Gaussian brightness probability
distribution at every point on the sky (and is indeed the
same everywhere). However, it is not a realistic approach
for an SZ β-model with position, mass and size
uncertainties in the presence of the CMB and a number of radio
sources. So we have chosen to represent the posterior in an
approximate way by drawing samples from it, the Markov
Chain Monte Carlo method (see e.g. Gilks et al. (1996),
Ó Ruanaidh & Fitzgerald (1996) for general introductions,
and Marshall et al. (2003), Bonamente et al. (2004) for
galaxy cluster specifics).

This process results in a set of sample parameter
vectors whose number density is proportional to the
posterior probability, such that all local maxima are explored in
proportion to their relevance. In order to ensure that the
correct regions of parameter space are being probed,
sufficient samples must be taken and calculations made. This
is problematic in that it must be both accurate and
efficient: to this end, we use the commercially available
sampler 'BayeSys' (Skilling (2002)), a powerful code designed
to be flexible enough to cope with a wide range of
problems. BayeSys makes use of a range of proposal distribution
'engines' that govern where next to sample, and in
particular employs those that it finds dynamically to be most
efficient for a particular posterior pdf. In addition, it should be
possible to assess whether or not enough evaluations have
been performed over an acceptable range of θ, that is when
the process has 'burnt in'. A review of such tests is given
in Cowles & Carlin (1996). We follow Marshall et al. (2003)
and argue that several short, independent burn-ins are a
good idea to check that they agree. The diagnostic we use is
the evidence itself, which we calculate by 'Thermodynamic
Integration' (see e.g. Ó Ruanaidh & Fitzgerald (1996)). The
method works as follows. The evidence (as given in equation
(11)) is

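$$P(\mathrm{data}|H) = \int P(\mathrm{data}|\theta,H)\,P(\theta|H)\,\mathrm{d}^N\theta = E(1) \tag{13}$$

We now write down

$$E(\lambda) = \int P^{\lambda}(\mathrm{data}|\theta,H)\,P(\theta|H)\,\mathrm{d}^N\theta \tag{14}$$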

BayeSys allows the running in parallel of several Markov
chains (typically 10 in our case). The key to the method is
as follows. The sampling starts with λ = 0. This means
that the new data are initially ignored with samples just
being drawn from the prior. At this stage, remote regions
of parameter space (that are at least allowed by the prior)
are sampled. λ is then gradually raised to one, at a rate
balancing the needs for computational speed and accuracy
in the log evidence calculation. The latter can be shown to
reduce to the numerical integral of the ensemble-averaged
log-likelihood with respect to λ (Ó Ruanaidh & Fitzgerald
(1996)).
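
The identity behind this calculation, ln E(1) - ln E(0) = ∫ <ln L>_λ dλ with E(0) = 1 for a normalised prior, can be checked on a toy problem. The sketch below is an illustration only and is not BayeSys: instead of Markov chains it evaluates the ensemble average <ln L>_λ exactly on a parameter grid, and the one-parameter Gaussian model, the grid and the function names are assumptions. The result is compared with the analytic evidence for that model.

```python
import numpy as np

def log_evidence_thermo(log_like, log_prior, theta, n_lambda=201):
    """Thermodynamic integration of ln E on a 1-D grid of parameter values.

    For each lambda the tempered posterior is proportional to L^lambda * prior;
    the mean log-likelihood under it is then integrated from lambda = 0 to 1.
    """
    ll = log_like(theta)
    lp = log_prior(theta)
    lambdas = np.linspace(0.0, 1.0, n_lambda)
    mean_ll = np.empty(n_lambda)
    for i, lam in enumerate(lambdas):
        logw = lam * ll + lp
        w = np.exp(logw - logw.max())   # subtract the maximum for numerical stability
        w /= w.sum()                    # normalised weights on the (uniform) grid
        mean_ll[i] = np.sum(w * ll)     # <ln L> at this lambda
    # Trapezoidal integral over lambda; E(0) = 1 because the prior is normalised.
    return np.sum(0.5 * (mean_ll[1:] + mean_ll[:-1]) * np.diff(lambdas))

# Toy model (assumed): one datum d_obs ~ N(theta, 1), prior theta ~ N(0, 1).
d_obs = 1.0
theta_grid = np.linspace(-10.0, 10.0, 4001)

def log_like(t):
    return -0.5 * (d_obs - t) ** 2 - 0.5 * np.log(2.0 * np.pi)

def log_prior(t):
    return -0.5 * t**2 - 0.5 * np.log(2.0 * np.pi)

log_ev = log_evidence_thermo(log_like, log_prior, theta_grid)
# Analytic check: marginally d_obs ~ N(0, 2).
log_ev_exact = -0.5 * np.log(2.0 * np.pi * 2.0) - d_obs**2 / 4.0
print(log_ev, log_ev_exact)
```

In a real application the grid average is replaced by an average over the MCMC ensemble at each λ, and the spacing of the λ steps sets the trade-off between speed and accuracy described above.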



4.5.3 Practicalities

It is always of utmost importance to ensure that one does 
not over-interpret the data available. This is crucial here, as 
we have not only fairly noisy data (due to the faint nature of 
the effect being studied), but also considerable contamina- 
tion from point sources and primordial CMB fluctuations. 
As is evident in the VSA data (figure |5J , and previously 
mentioned in section FP1 CMB features may be comparable 
in strength to the SZ decrement itself. It would be quite pos- 
sible to fit, accidentally, to a negative CMB feature which 
would be very misleading. Our method avoids this danger by 
including all contaminants in the model, and fitting all pa- 
rameters simultaneously. We have chosen to fit a simple but 
well-motivated model to our data, but even so we must fit 
six parameters plus source fluxes and positions. This makes 
the task computationally expensive (vastly more so than
using Maximum Likelihood). In order to extract parameters
for a single cluster, around 100 hours of computer time is
required (2 GHz processor). We do not expect to place tight
constraints on, for example, β or r_c and we anticipate broad
probability distributions for all parameters. However, when
we marginalise properly over all parameters we find some
interesting constraints on f_gas.

In order to compare a sample model with the VSA data,
we project the model gas pressure and map the
Comptonisation onto a grid. A Fast Fourier Transform is then
performed, and the result interpolated onto the u-v coordinates.
These predicted visibilities are then compared to the observed
cluster visibilities. Working directly with the visibilities has the
advantages described in Section 4.3. We deal with point
sources and the CMB in the following natural way. The
Fourier transform of a delta function is a constant
amplitude sine wave. This can be used to increment all the
predicted visibilities by an amount specific to each source's sample
parameters. The uncertainty on each measured visibility is
Gaussian and has contributions from both the thermal noise
in the receivers (which is uncorrelated) and the primordial
CMB fluctuations (which are correlated between adjacent
points in the u-v plane). The resultant noise covariance
matrix C is non-diagonal but calculable given a primordial
power spectrum, assumed to be well known. The likelihood
of the visibility data is therefore

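$$p(\mathbf{d}|\theta,H) = \frac{1}{(2\pi)^{N_{\rm vis}}\,|\mathbf{C}|^{1/2}} \exp\left[-\frac{1}{2}\,(\mathbf{d}-\mathbf{d}^{p})^{T}\,\mathbf{C}^{-1}\,(\mathbf{d}-\mathbf{d}^{p})\right] \tag{15}$$

where d and d^p represent the observed and predicted
visibility vectors respectively, and N_vis is the number of visibilities.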
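
The two ingredients just described, the point-source contribution to the predicted visibilities and a Gaussian likelihood with a non-diagonal covariance matrix, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in this work: the phase sign convention, the neglect of primary-beam attenuation, the stacking of real and imaginary parts into one real data vector, the toy covariance and all function names are assumptions.

```python
import numpy as np

def point_source_visibilities(u, v, flux, l, m):
    """Visibilities of a point source of given flux at direction cosines (l, m).

    The Fourier transform of a delta function has constant amplitude and a
    linear phase. Sign convention assumed; primary-beam attenuation ignored.
    """
    return flux * np.exp(-2j * np.pi * (u * l + v * m))

def stack(vis):
    """Stack real and imaginary parts into a single real data vector."""
    return np.concatenate([vis.real, vis.imag])

def log_likelihood(d_obs, d_pred, cov):
    """Gaussian log-likelihood for real data vectors with covariance cov."""
    r = d_obs - d_pred
    chol = np.linalg.cholesky(cov)         # cov = chol @ chol.T
    z = np.linalg.solve(chol, r)           # so that z @ z = r^T cov^-1 r
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    n = r.size                             # n = 2 N_vis for stacked complex data
    return -0.5 * (z @ z + log_det + n * np.log(2.0 * np.pi))

# Illustrative numbers only: three baselines (in wavelengths) and one source.
u = np.array([10.0, 20.0, 30.0])
v = np.array([0.0, 5.0, -5.0])
d_pred = stack(point_source_visibilities(u, v, flux=0.05, l=1e-3, m=-2e-3))

# Toy covariance: diagonal thermal noise plus a smooth correlated term
# standing in for the primordial CMB contribution.
idx = np.arange(d_pred.size)
thermal = 0.01**2 * np.eye(d_pred.size)
cmb = 0.005**2 * np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)
cov = thermal + cmb

d_obs = d_pred + 0.01                      # pretend observation, offset from the model
log_l = log_likelihood(d_obs, d_pred, cov)
print(log_l)
```

Using a Cholesky factorisation avoids forming the inverse of C explicitly and gives the determinant at no extra cost, which matters when the likelihood is evaluated for every MCMC sample.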

The priors used to characterise the various model
parameters are summarised in Table 3. As mentioned in Section
4.4, tight priors were placed on both the cluster position, and
point source positions and fluxes. For the cluster centroid,
the X-ray centre (Böhringer et al. (2000)) was included as a
Gaussian prior of width 1 arcmin. We chose to place a weak
prior on core radius such that it be determined by the data



Table 3. Priors for the cluster analysis. Positions and gas
temperatures for individual clusters are quoted in Table 1.

Parameter    Prior
Position     Gaussian, 1 arcmin
r_c          Uniform, 1-1000 kpc
β            Uniform, 0.3-1.5
T            Gaussian, ASCA value ±15%
M_gas        Uniform, (0.01 - 3.00) × 10^14



to hand. The prior on the β parameter encompasses the
extremes of the range of values found in clusters to date. The
temperature prior allows a generous error on the fit. Note
that f_gas depends on T^-2 (see Grego et al. (2001)). The prior
on the gas mass more than encompasses the accessible range.
The point source fluxes included in the model were also
assigned Gaussian priors, based on the source-subtractor
measurements and their uncertainty. The prior on each source
flux was broadened to account for variability of a factor of
1.33 times the measured flux: this step was only taken when
the epoch of the source measurement was significantly
different from that of the cluster observation. For the sources
selected using predictions from lower frequencies, positional
accuracies were taken from the GB6 catalogue. The sources
detected in the RT surveys were assumed to have positional
uncertainty of ±40 arcsec in both RA and Dec; this is wide
enough to cover even the weakest sources.
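
The cluster priors of Table 3 can be written down as a single log-prior function, which is the form an MCMC sampler typically needs. The sketch below is illustrative only: the dictionary keys, the treatment of the 1 arcmin position prior as independent Gaussians in two offsets, and the interpretation of the gas-mass range as a plain number (the table gives no unit) are assumptions, and source-flux and source-position priors are omitted.

```python
import numpy as np

def log_gaussian(x, mean, sigma):
    """Log of a normalised Gaussian pdf."""
    return -0.5 * ((x - mean) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

def log_uniform(x, lo, hi):
    """Log of a normalised uniform pdf; -inf outside [lo, hi]."""
    return -np.log(hi - lo) if lo <= x <= hi else -np.inf

def log_prior(params, xray):
    """Sum of the cluster log-priors listed in Table 3.

    params: dict with hypothetical keys dx, dy (arcmin offsets from the
            X-ray centre), r_c (kpc), beta, T (same unit as xray["T"]), M_gas.
    xray:   dict holding the X-ray temperature used to centre the T prior.
    """
    lp = log_gaussian(params["dx"], 0.0, 1.0)                    # position, 1 arcmin
    lp += log_gaussian(params["dy"], 0.0, 1.0)
    lp += log_uniform(params["r_c"], 1.0, 1000.0)                # core radius, kpc
    lp += log_uniform(params["beta"], 0.3, 1.5)                  # beta parameter
    lp += log_gaussian(params["T"], xray["T"], 0.15 * xray["T"]) # ASCA value +/- 15%
    lp += log_uniform(params["M_gas"], 0.01e14, 3.00e14)         # gas mass
    return lp

# Illustrative values only.
xray = {"T": 8.2}
sample = {"dx": 0.2, "dy": -0.1, "r_c": 150.0, "beta": 0.7, "T": 8.0, "M_gas": 1.0e14}
print(log_prior(sample, xray))
```

A sample outside any uniform range receives a log-prior of minus infinity, so the sampler never accepts it.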

4.6 The Effect of Primordials on f_gas Estimates

In the context of large angular scale SZ observations, the
CMB is additional noise which will provide a source of error
in the determination of f_gas. This extra noise was dealt with
correctly when calculating cluster parameters (see Section
4.4). However, here we present a simple argument
describing why, in situations where the SZ data are used to infer
both the gas mass and the total mass (as discussed in Section 4.2),
the contamination is not as catastrophic as one may
anticipate. With the present data quality, fitting a β-model is
doing little more than fitting an offset plus a slope. If there
is more negative signal due to a negative CMB feature
coinciding with the cluster position, then the M_gas estimate
will be higher. (NB: This is a simplistic argument because
of course the contribution to the Comptonisation
parameter depends on the mass distribution which is linked to the
total mass.) Now, in estimating M_T, the effect of the above
will be to increase the central concentration, increasing β or
decreasing r_c. Examination of equation (6) shows that this
effect will increase an estimation of M_T. So, in this type of
scenario, as both M_gas and M_T will be higher, the effects of
the CMB tend to cancel out when calculating f_gas for the
cluster in question. A similar effect is observed for a bright
primordial feature - the SZ signal will tend to decrease, and
β will also decrease as the cluster will appear to be less
centrally condensed. Thus, if the primordial CMB
contamination happens to be correlated over the measured uv-range,
then the effects on M_gas and M_T tend to cancel, leaving f_gas
little affected.

In general, depending on the actual sizes, shapes and
positions of the primordial features behind the SZ
decrement, f_gas may be pushed higher or lower, or remain
relatively unaffected as outlined above. Of course, if there is
a universal value of f_gas, then combining the results from
a reasonable number of clusters will both help reduce any
remaining effects and also help to evaluate the effect's
magnitude. One may intuitively regard the cases presented above
to be the 'worst case scenario', when in fact they appear not
to cause too great a difficulty.

Table 4. Gas fraction estimations for A478 with the inclusion of
a test contaminant source of flux S_add at the cluster centre.

S_add (mJy)    f_gas
-100           0.056 (+0.088/-0.041)
-50            0.10 (+0.12/-…)
-25            0.12 (+0.14/-0.07)
0              0.12 (+0.11/-0.06)
25             0.1… (+0.14/-…)
50             0.11 (+0.13/-0.06)
100            0.10 (+0.09/-0.05)

We have performed a simple simulation in order to
examine this cancelling effect semi-quantitatively. Using our
A478 data, we placed a test source of flux S_add at the
pointing centre and re-calculated f_gas. Results for test sources in
the range -100 ≤ S_add ≤ 100 mJy are presented in Table 4.
Although this is by no means a rigorous test of the argument
postulated, we note that the values of f_gas for all S_add are
consistent within errors. This indicates that, for our uv-range
and chosen cluster sample, the effect of the CMB tends to
cancel out. Note that
typical SZ fluxes are ≈150 mJy, whereas CMB plus receiver
noise will typically produce features of ~100 mJy, and
occasionally >150 mJy. From these simple calculations, we argue
that estimations of f_gas should be relatively unaffected by
the presence of primordial CMB in all but the worst cases.

4.7 Other Effects on f_gas

In this work, the random errors present are larger than
any systematics, but here we present a brief discussion of
some possible additional sources of error. Our assumptions
of isothermality and sphericity may affect our inferred
values for f_gas. If a cluster were not isothermal, we may, for
instance, overestimate the temperature in the outer regions
due to a temperature gradient, and may overestimate both
the gas and total mass with a possible small net
underestimate of the gas fraction. Regarding asphericity, which we
do not expect to have a large effect since we are not using
X-ray surface brightness, we point out that our sample is
orientation unbiased, because our flux limit is well above
the flux limit of the X-ray survey from which the clusters
were chosen. Grego et al. (200…) made mock observations of
a simulated cluster population, finding no bias as a result of
using a spherical isothermal β-model, suggesting that these
two sources of systematic error indeed may not be
significant in this work. Additionally, Arnaud et al. (2004) find
that the temperature variation for clusters observed with
XMM-Newton is less than 10% out to half the virial
radius, and similarly Zhang et al. (2004) find errors on mass
estimates from XMM-Newton data to be less than 25% as
a result of temperature gradients. Generally, X-ray derived
pressure maps seem to show a factor of two less variation, for
example azimuthally around the cluster centre, than either
density or temperature maps. Still, gas clumping could be
a problem. Clumps, if unresolved, will lead to enhanced
signal in an X-ray map and thus bias the cluster temperature.
This will artificially increase the inferred total mass.
However, the SZ data themselves are less sensitive to clumping
as the SZ signal is proportional to n_e rather than n_e^2.
Ultimately, the comparison of high signal-to-noise SZ data with
X-ray measurements will constrain the level of clumping in
clusters.

4.8 Cluster Parameters 

We discuss the constraints placed on core radius and
β-parameter by the VSA data, and also present results for
the gas mass, total mass and gas fractions calculated out to
both r_200 and r_500. For comparison, a summary of cluster
parameters derived from X-ray data is presented in Table 5.

We find, as anticipated, that the cluster parameters β
and r_c are poorly constrained by the SZ data, as shown in
Figure 3. For Coma, A1795, A478 and A2142 there is
considerable degeneracy between the two parameters. It is only
possible to place limits on the two parameters together -
little can be said about them as separate entities. This is
largely due to the limited range of angular scales sampled
by these data, and indeed by any SZ data to date. Ideally, one
would combine the VSA data with observations on smaller
angular scales. This is impossible in this case, as instruments
such as the RT would completely resolve out signal from the
clusters in our sample. AMI (see e.g. Kneissl et al. (2001))
will work over a larger range of angular scales and should
start to break this degeneracy. A401, A399 and A2244 are
not detected in the cluster maps, so it is perhaps
unsurprising that little constraint can be placed upon the shape
parameters by these data.

We present the median of the probability distribution
for the gas mass, total mass and gas fraction for each
cluster, evaluated to both r_200 and r_500, in Table 6. The
errors quoted are the values of the 16.5th and the 83.5th
percentiles. We note that A1795, A478 and A2142 all favour a
gas mass of around 10^14 M_⊙. The Coma data allow very
high gas masses. This may be interpreted as the cluster
position coinciding with a negative feature in the CMB,
thus making the SZ decrement appear deeper. The
converse may be true for the other three clusters, in that their
SZ signals may be partially 'obscured' by hot spots in the
CMB. If this were true it would have the effect of reducing
the preferred values of the gas mass, and indeed these
objects do allow low values of this parameter. (Note: although
here we choose to follow Myers et al. (1997) in using X-ray
temperatures from Markevitch et al. (1998), we recognise
that more recent data are available. Repeating the
analysis using XMM-Newton temperatures (Pointecouteau et al.
(2004), Sun et al. (200…)) we find that the resulting f_gas
values are fully consistent with those presented in Figure 5 and
Table 7. Any variations are below the random errors present
in the VSA data.)
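
Summarising a skewed posterior by its median and the 16.5th and 83.5th percentiles is straightforward once MCMC samples are available. The sketch below shows one way to do it; the function name and the synthetic log-normal samples (standing in for real gas-fraction samples) are assumptions for illustration.

```python
import numpy as np

def summarise(samples):
    """Return (median, upper error, lower error) from posterior samples.

    The errors are the distances from the median to the 83.5th and 16.5th
    percentiles, so they are generally asymmetric for a skewed posterior.
    """
    lo, med, hi = np.percentile(samples, [16.5, 50.0, 83.5])
    return med, hi - med, med - lo

# Synthetic, skewed stand-in for a set of f_gas samples (illustrative only).
rng = np.random.default_rng(0)
fake_samples = rng.lognormal(mean=np.log(0.1), sigma=0.5, size=20000)
med, err_up, err_down = summarise(fake_samples)
print(f"f_gas = {med:.3f} (+{err_up:.3f}/-{err_down:.3f})")
```

For the long-tailed distribution used here the upper error comes out larger than the lower one, which a single symmetric error bar would hide.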

It is interesting to examine the constraints placed on
the relationship between total mass M_T and gas
temperature by the VSA SZ data. In Figure 4 we plot the X-ray
Table 5. Cluster parameters derived from X-ray data. References are [1] Mason & Myers (2000), [2] Mohr et al. (1999), [3]
Sanderson & Ponman (2003).





r c 






P 






n 




(arcmin) 








(10-3^/ 2 cm-3 


) (10-3^/ 2 cm-3 ) 




[1] 


[3] 


[1] 


[2] 


[3] 


[1] 


[2] 


Coma 


9.32 ±0.10 




0.670 


u. 'uo_ 046 




4 Kl+0.04 


, 19 +0.04 
CS '- L -0.04 


A1795 


2.17 ±0.28 


4 0l+ - 20 


0.698 


n 790+ 031 
u. <au_ 032 


0.83±0.02 


n.29±°f 7 


29.9ti.5 


A399 


4.33 ±0.45 


1 - 89 -0.36 


0.742 




0.53±0.05 


3 24 +0 - 14 
°' z -0.19 




A401 


2.26 ±0.41 


r, ,7 + 0.09 
z -°'-0.09 


0.636 


606+ 015 

U.DUD_g Q16 


0.63±0.01 


8.oitS:8 


c- 07+O.43 
°-°'-0.27 


A478 


1.00 ±0.15 


n o 4 +0.23 
z, ° -0.23 


0.638 


n 71 ,+0.030 
u. I ro_ .033 


0.75±0.01 


28.9tJ 5 9 2 


38.lt?.5 


A2142 


1.60 ±0.12 


o 14 +0.22 
J - i4 -0.22 


0.635 


n 7Q7+0.082 
u -'°<-0.093 


0.74±0.01 


15.03±; : g? 


15. 8^2 4 


A2244 


0.82 ±0.14 




0.580 


n 504+O O6 1 




17.73± 2 ;^ 






Figure 3. Plots illustrating the constraints placed on β-parameter and core radius by the cluster data. In each plot, the x-axis is β and
the y-axis is core radius (kpc). 68% and 90% contours are shown. Panels include (e) A478, (f) A2142 and (g) A2244.



determined temperature and the total mass M_T derived
using equation 6. We expect, of course, some scatter on the
values of M_T for each cluster due to the CMB
contamination of the SZ data. After examination of equation 6, we
argue that the normalisation of our M-T relation is in fact
mainly determined by the profile fitting parameters β and r_c
derived from the VSA data, and depends only weakly on T_X
(as T_X^-1/2 for the self-similar 3/2 slope of the M-T relation).
This means in Figure 4 that the effect of any uncertainty in
T (and consequently in M_500) for a given set of β, r_c from
the VSA will move the data points within their large error
boxes almost parallel to the slope of the M-T relation. For
comparison we plot the normalisation of the M-T relations
from hydrodynamical adiabatic simulations (Evrard et al.
(1996)) and X-ray cluster data (Finoguenov et al. (2001)).
We calculate our normalisation constant for M ∝ T^3/2 to be
2.33 (+…/-…) × 10^13. This is in good agreement with the recent
M-T determinations derived from X-ray data (Allen et al.
(2001), Pratt & Arnaud (2002)). In a forthcoming paper we
intend to investigate the possibility of determining the M-T
relation from SZ without the use of an X-ray temperature.
Such an M-T relation, based on a measurement of the
global gas pressure distribution via the SZ effect, will be
interesting to contrast with X-ray measurements. This kind of
work will be very useful for the interpretation of upcoming
SZ cluster surveys.
Figure 4. The mass-temperature scaling relation derived from fitting gas pressure profiles to the VSA SZ data; the mass axis is M_500 (10^14 h^-1 M_⊙). The temperature axis shows
the X-ray temperatures given in Table 1, and the temperature also enters M_500 linearly. The dashed line uses the normalisation from hydrodynamical
adiabatic simulations (Evrard et al. (1996)), and the solid line represents the best fit M-T relation of Finoguenov et al. (2001).



The f_gas probability distributions are highly
non-Gaussian, and are plotted on the same axes in Figure 5. The
errors quoted are the values of the 16.5th and the 83.5th
percentiles. In order to compare values for individual clusters,
we summarise results from other experiments in Table 7.

We have combined the posterior probability density
functions for each cluster gas fraction as follows (see
Marshall (2004), in preparation, for more details). Simulating
the effect of simultaneously fitting all our SZ data with the
same global gas fraction f_gas requires dividing out the prior
on the individual cluster gas fraction (which can be derived
from a set of MCMC samples with no data, see Slosar et al.
(2003)) and then multiplying the resulting effective
likelihoods together. Modulating this product by the prior on
f_gas, which we take to be uniform over the range [0-0.2],
gives us the posterior pdf P(f_gas|data). Moreover, keeping
track of the normalisations allows us to compute a relative
probability for the act of combination itself, that is, the ratio
P(data|H^global)/P(data|H^i), where H^i is the hypothesis 'all
clusters have independent gas fractions f_gas', whilst H^global
is the alternative hypothesis that 'all clusters have the same
gas fraction f_gas'.
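
The combination procedure can be sketched numerically on a grid of f_gas values. The code below is an illustration under stated assumptions and not the implementation referred to above: the posteriors are toy Gaussians rather than MCMC histograms, the individual priors are taken to be uniform, the independent-fractions hypothesis is given the same uniform prior on each cluster's gas fraction, and the function names are hypothetical. Each effective likelihood is known only up to a constant factor (the individual evidence), which cancels in the ratio.

```python
import numpy as np

def trapz(y, x):
    """Trapezoidal integral of y over x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def combine_gas_fractions(f, posteriors, priors):
    """Combine per-cluster f_gas posteriors under a single global gas fraction.

    f:          grid of f_gas values spanning the global prior range
    posteriors: list of per-cluster posterior pdfs evaluated on f
    priors:     list of the per-cluster prior pdfs evaluated on f
    Returns the combined posterior pdf and the evidence ratio global/independent.
    """
    global_prior = np.full_like(f, 1.0 / (f[-1] - f[0]))     # uniform on the grid range
    eff = [post / prior for post, prior in zip(posteriors, priors)]  # divide out priors
    joint = np.prod(eff, axis=0) * global_prior               # product of likelihoods x prior
    ev_global = trapz(joint, f)
    ev_indep = np.prod([trapz(e * global_prior, f) for e in eff])
    return joint / ev_global, ev_global / ev_indep

def gaussian_pdf(x, mean, sigma):
    """Gaussian shape normalised on the grid x."""
    p = np.exp(-0.5 * ((x - mean) / sigma) ** 2)
    return p / trapz(p, x)

# Toy example: two clusters with overlapping posteriors (illustrative numbers).
f = np.linspace(0.0, 0.2, 2001)
flat = np.full_like(f, 1.0 / 0.2)
posts = [gaussian_pdf(f, 0.08, 0.02), gaussian_pdf(f, 0.10, 0.02)]
combined, ratio = combine_gas_fractions(f, posts, [flat, flat])
mean_f = trapz(f * combined, f)
print(f"combined mean f_gas = {mean_f:.3f}, evidence ratio = {ratio:.2f}")
```

An evidence ratio above one indicates that the individual posteriors overlap well enough for a single shared gas fraction to be the more economical description; widely separated posteriors would drive the ratio below one.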



We first assume that all our clusters have one true
global gas fraction value, f_gas. We combine the individual
probability density functions for all of our clusters, including
those with what would classically be called non-detections.
We find f_gas h_100 = 0.023 (+…/-…), with an evidence ratio in
favour of this all-encompassing combination of

$$\frac{P(\mathrm{data}|H^{\rm global})}{P(\mathrm{data}|H^{i})} = 4.4. \tag{16}$$



We can also divide the data into two sets, those from
detected clusters and those from non-detections, and again
investigate the suitability of their combination. Let
hypothesis H^global_det consist of the assertions that there is a global gas
fraction f_gas exhibited by the detected clusters, and that
there is another gas fraction-like parameter X for the
non-detections; we find the following evidence ratios:

$$\frac{P(\mathrm{data}(\mbox{detections})|H^{\rm global}_{\rm det})}{P(\mathrm{data}(\mbox{detections})|H^{i})} = 0.92, \tag{17}$$

$$\frac{P(\mathrm{data}(\mbox{non-detections})|H^{\rm global}_{\rm det})}{P(\mathrm{data}(\mbox{non-detections})|H^{i})} = 7.41. \tag{18}$$
Table 6. Gas masses, total masses and gas fractions for the VSA cluster sample evaluated to both r_200 and r_500.

Cluster   M_gas(r_200) h^2   M_gas(r_500) h^2   M_T(r_200) h   M_T(r_500) h   f_gas(r_200) h   f_gas(r_500) h
          (10^13 M_⊙)        (10^13 M_⊙)        (10^14 M_⊙)    (10^14 M_⊙)



Coma 


15.4±1;° 


D -°-3.0 


10.9t 6 ° ° 


c o+5.5 
°- z -3.1 


u - io -o.io 


u ' lo -0.09 


A1795 




o c + 1-6 
o.o_ 1 7 


7 n+6.5 


o 4 +3.7 
°- -2.0 


12+ 015 
u ' lz -0.070 


n ,1+0.090 
u,11 -0.060 


A399 




o.7t£;S 


7 fi+ 5 - 3 
' - D -3.9 


q 1 +3.6 
° -2.0 


O.OSOt^ 4 


U.U28_ 02Q 


A401 


5.o±l;° 


3.o±i;l 


10.7i|;g 




o.M8±8:S 


n n55+ ' 055 

U.UJO_q 02 g 


A478 


11 2+ 4 ' 


°-'-2.2 


10.8+1° 


4.8± 3 ; 7 


12+ 011 
U-1 -0.06 




A2142 


n 2+ 40 


D - x -1.8 


i5.3±i:g 


7 q+4.6 


n n74+ - 068 

u - u ' 4 -0.034 


0.086ir i 


A2244 


1 0+1.6 
1 -°-0.8 


4 4+8-4 


™ + 4i 


3 q+3.8 
^ — 2.1 


020+ 039 
u.uzu_ 015 


0.020±g;°?i 



Table 7. Gas fractions estimated within R_0 from SZ data ([1] Myers et al. (1997)) and within r_500 from X-ray data ([2] Mason & Myers
(2000), [3] Mohr et al. (1999), [4] Ettori & Fabian (1999)).





[1] 


R /i(Mpc) 
[1] 


[2] 


f h 3/2 
[3] 


f h 3/2 
[4] 


Coma 


0.063 ± 0.017 


1.50 


0.0603±0.0028 


0.177±0.019 




A1795 






0.0477±0.0036 


0.190±0.008 


0.184 ±0.011 


A399 






0.0655±0.0032 






A401 
A478 
A2142 
A2244 


0.166 ± 0.014 
0.060 ±0.011 


0.976 
0.76 


0794+ 0044 

U.U/3t_Q 0062 

0.0760tg;gg™ 
0.0890tg;ggg* 
0.0739ig:S™ 


0.247±0.012 
214+ 0012 

n 997 +0.024 
u - zz '-0.017 

iqfi+ 061 


0.230 ±0.013 
0.172 ± 0.023 
0.255 ± 0.033 
0.204 ±0.104 



The former suggests that the data are not good enough to
distinguish between the global gas fraction hypothesis and
that of all four detected clusters taking independent
values of f_gas. However, the latter points strongly towards the
combination of the non-detections' gas fractions. The overall
evidence ratio from this 'split sample' analysis is therefore:

$$\frac{P(\mathrm{data}(\mbox{all})|H^{\rm global}_{\rm det})}{P(\mathrm{data}(\mbox{all})|H^{i})} = \frac{P(\mathrm{data}(\mbox{detections})|H^{\rm global}_{\rm det})}{P(\mathrm{data}(\mbox{detections})|H^{i})} \times \frac{P(\mathrm{data}(\mbox{non-detections})|H^{\rm global}_{\rm det})}{P(\mathrm{data}(\mbox{non-detections})|H^{i})} = 6.82 \tag{19}$$



This is higher than the result in equation (16), indicating that
the split sample analysis is more appropriate. The
interpretation is that the detected clusters are telling us about a
global cluster gas fraction f_gas, while the non-detections are
telling us far more about the primordial fluctuations
(inappropriately parameterised by X). Our 'headline' result is
therefore that from combining the four detected clusters' gas
fractions as above: f_gas h_100 = 0.08 (+0.06/-0.04).

In order to address the true value of a global f_gas
we need better data, which the likes of AMI (see e.g.
Kneissl et al. (2001)), AMIBA (see e.g. … (2002)) and the
SZA (see e.g. Mohr et al. (2002)) should provide. We have,
however, developed and demonstrated a useful method for
estimating the effect of, and for controlling, systematics. We
could do even better in estimating a universal f_gas if we were
able to use prior information (from X-rays and lensing) on
the likely detectability in SZ of each cluster. This would
require us to be able to separate the 'position' and 'existence'
implicit in the priors we use; we are planning to attempt
this.

We can also place formal constraints on Ω_m h by
assuming that our estimation for f_gas h is indeed the global value.

$$\Omega_m h = \frac{\Omega_b h^2}{f_{\rm gas} h} \tag{20}$$

Rebolo et al. (2004) infer Ω_b h^2 and h_100 from VSA and
WMAP primordial CMB data, using a flat ΛCDM model.
We take these values and find Ω_m h = 0.33 (+…/-…).

Another implication concerns the clumping of the
cluster gas. The broad agreement here between f_gas values from
X-ray and from SZ, and as discussed in e.g. Grego et al.
(2001), rules out significant clumping.



5 CONCLUSION 

We have investigated with the VSA Extended Array at ≈
34 GHz the SZ effects towards seven nearby clusters that
form a complete, X-ray-flux-limited sample.

(i) Four of the clusters (Coma, A1795, A478, A2142) show
SZ effects in the map plane on scales of ~20 arcmin of
typically 6σ.

(ii) There is significant detection of CMB primordial 
structure at this resolution, which is the likely cause of the 
three non-detections (A399, A401, A2244). 

We have analysed the data in the uv-plane, with X-ray
priors on positions and gas temperatures and radio priors on
the sources, using MCMC to estimate key cluster parameters
in the context of a β-model for the gas distribution. In this
context, the CMB primordial fluctuations are an additional

Figure 5. Plot of the probability distributions for f_gas for each cluster, and that derived from combining the full sample. The horizontal axis is f_gas h_100.



source of Gaussian noise, and are included in the model as 
a non-diagonal covariance matrix derived from the known 
angular power spectrum. We use the SZ data (plus the pri- 
ors) to give both the gas mass and, under the assumption of 
hydrostatic equilibrium, the total mass. Although the data 
have high random errors, the use of Bayesian methods, prob- 
ability density functions and marginalisation prevents bias 
in the results. 

(iii) The degeneracy is evident between β and core
radius as expected for such observations sensitive to SZ over a
narrow ℓ-range. There are significant measurements of gas
fractions in the detected clusters.

(iv) We present a normalisation of the M-T relation de- 
rived from our data which we find to be in good agreement 
with recent X-ray cluster measurements. 

(v) Using the gas fraction probability density function
for each cluster, we have produced combined gas fractions
for the four detections, for the three non-detections, and
for all seven. The Bayesian evidence shows that the first
is the correct one to use in the context of trying to
measure a low-z global gas fraction. For this we here find
f_gas = 0.08 (+0.06/-0.04) h_100^-1.

(vi) Gas fraction measurement by this SZ-based method 
is relatively immune from the effect of primordial CMB 



anisotropy. This is true since the effect on gas mass tends 
to cancel the effect on total mass on the narrow range of 
angular scale employed. Simulations show the cancellation 
to be good for contaminant fluxes of ±50 mJy.

That the analysis method works as well as it does points 
the way towards analysis of data from upcoming SZ tele- 
scopes. 